1. Introduction

Simulation of various physical, biological or social complex systems allows us to develop elaborate models for them and helps in making valid inferences from them. There are many situations in which systems cannot be easily described in a compact form for analysis and prediction. Modern computer simulations allow us to represent such systems by a series of simpler models and thus help us provide reasonable solutions to complex problems.

A schematic representation of the simulation strategy for developing models of complex systems has been given by Ziegler et al. (1975).

[Figure 1: schematic of the simulation strategy, with the elements Object Specification, Available Knowledge, Model, Simulation and Validation.]


The validation of models requires some sort of optimization. One has to provide criteria of optimization and possible techniques to achieve that optimization to complete the process of validation.

Optimizing techniques are also required in other aspects of simulation experiments, such as their design and ultimate analysis. The methods of optimization form a vast body of knowledge spreading over several fields. We have classified optimizing methods broadly in the following categories and have arranged the list of references in that order.

A.  Classical  optimizing  techniques; 

B.  Numerical  procedures; 

C.  Mathematical  programming  methods; 

D.  Stochastic  approximation  methods; 

E.  Optimum  seeking  methods  and  response 
surface  methodology; 

F.  Optimal  Design  Theory; 

G.  Miscellaneous  methods. 

In this brief account, we emphasize those optimization techniques which are of potential use in simulation methodology. We shall concentrate here on Optimizing Criteria, Classical Methods, Numerical Methods, Optimal Search Procedures, Response Surface Methods and Optimal Designs of Regression Experiments.

However, technical references are provided on various other optimization techniques for the interested reader.
2. Optimizing Criteria

Optimization depends fundamentally on the criteria used in a given situation. The same problem may lead to different solutions depending upon the criteria of optimality utilized. The criteria depend on the nature of the problem and are often dictated by practical considerations. Consider the case of least squares estimation of parameters in hypothesized models. The criterion of minimizing the sum of squares of residuals was dictated more by mathematical convenience than by a heuristic point of view. It allows a simple mathematical solution in most cases. However, if the criterion of optimality is chosen to be that of minimizing the sum of absolute deviations of residuals, the mathematical simplification is minimal and recourse has to be made to numerical solutions. It may be highly important to select the "right" criterion of optimality in a given situation.

There does not seem to be a simple and logical approach to choosing among a class of competing criteria of optimality for a given problem. Experience and intuition in a given setting may be the ultimate judge for proper selection. In many situations, however, more is known about the comparative properties of the optimality criteria and the experimenter is guided by such considerations to select the appropriate criterion. We shall discuss some of the most commonly used criteria in this section.


Least  Squares  Criterion 

One of the most common criteria used in validation of models is that of least squares. Given the realization of the process from simulations or actual observations, the observed values and the expected values under the assumed model are compared. The unknown parameters of the model are taken to be those values which minimize the sum of squared deviations. Various other criteria, such as the sum of absolute deviations or the weighted least squares criterion, are also in use. The criterion to be chosen depends heavily on the experimental situation.
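As a small illustration of how the criterion changes the answer, the sketch below fits a single constant to some observations. The data are hypothetical. For this simplest model the least squares solution is the sample mean and the least absolute deviations solution is the sample median, so one outlying observation moves the two estimates very differently.

```python
from statistics import mean, median

def sum_sq(data, c):
    """Sum of squared deviations of the data from the constant c."""
    return sum((y - c) ** 2 for y in data)

def sum_abs(data, c):
    """Sum of absolute deviations of the data from the constant c."""
    return sum(abs(y - c) for y in data)

# Hypothetical observations containing one outlier.
data = [1.0, 2.0, 2.5, 3.0, 20.0]

c_ls = mean(data)     # minimizes sum_sq for a constant model
c_lad = median(data)  # minimizes sum_abs for a constant model

print("least squares estimate:", c_ls)
print("least absolute deviations estimate:", c_lad)
```

The outlier pulls the least squares estimate well away from the bulk of the data, while the least absolute deviations estimate stays with it.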


Example  (Milstein  (1979)) 

In a biochemical process, the equations of the process are described by the following

$$ \frac{dx}{dt} = f(x, k), \qquad x(0) = c, $$

where the vector x is n-dimensional with nonnegative components, k is a vector of parameters having p unknown components, and f is a vector function. The vector c represents the given initial condition. Let the data be given by $y(t_r)$ at the r-th time point $t_r$, and let the corresponding value of x be given by $x(k, t_r)$. Let W be the matrix of known weights; then a common measure of the discrepancy between the data points y and the trajectories can be the following


$$ F(k) = \sum_{r=1}^{M} \left[ y(t_r) - x(k, t_r) \right]' \, W \, \left[ y(t_r) - x(k, t_r) \right] $$

M is the number of points chosen. The object will be to determine the unknown parameters k, which can be obtained by using the criterion of minimizing F(k). A computer algorithm is given by Milstein (1979) in terms of an iterated numerical procedure starting with a first guessed value of k.
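The following is a minimal sketch of this kind of fit, not Milstein's algorithm. It assumes a hypothetical one-dimensional rate law dx/dt = -kx, unit weights, a fourth-order Runge-Kutta integrator and a simple step-halving search starting from a first guess; all function names are illustrative.

```python
import math

def f(x, k):
    """Hypothetical one-dimensional rate law dx/dt = -k*x."""
    return -k * x

def solve(k, c, t_end, n_steps=100):
    """Integrate dx/dt = f(x, k) from x(0) = c to t_end with RK4 steps."""
    h = t_end / n_steps
    x = c
    for _ in range(n_steps):
        k1 = f(x, k)
        k2 = f(x + 0.5 * h * k1, k)
        k3 = f(x + 0.5 * h * k2, k)
        k4 = f(x + h * k3, k)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

def F(k, c, times, data, weights):
    """Weighted sum of squared discrepancies between data and trajectory."""
    return sum(w * (y - solve(k, c, t)) ** 2
               for t, y, w in zip(times, data, weights))

def fit(c, times, data, weights, k0=1.0, step=0.5, tol=1e-8):
    """Step-halving search for the k that minimizes F, starting from k0."""
    k, best = k0, F(k0, c, times, data, weights)
    while step > tol:
        improved = False
        for cand in (k - step, k + step):
            val = F(cand, c, times, data, weights)
            if val < best:
                k, best, improved = cand, val, True
        if not improved:
            step /= 2.0
    return k

# Synthetic data generated with k = 0.7 and c = 2.0 (illustrative values).
times = [1.0, 2.0, 3.0, 4.0]
data = [2.0 * math.exp(-0.7 * t) for t in times]
k_hat = fit(2.0, times, data, [1.0] * len(times))
print("estimated k:", k_hat)
```

The search is deliberately crude; in practice a Gauss-Newton or similar iteration would normally replace it.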

In the context of design of experiments, which is highly pertinent to simulation experiments, we discuss a few criteria which are in common use. Consider the model

$$ y = X\beta + \varepsilon, $$

where y is the observation vector in n dimensions, X is an n x p design matrix, $\beta$ is a p x 1 vector of unknown parameters and $\varepsilon$ is an n x 1 vector of residuals. If we use the least squares method to estimate $\beta$, it is well known that we minimize $\varepsilon'\varepsilon = (y - X\beta)'(y - X\beta)$, leading to the optimal estimate of $\beta$ given by

$$ \hat{\beta} = (X'X)^{-1} X'y. $$

In the problem of finding the optimum X such that the parameter $\beta$ is estimated optimally, one considers the covariance matrix of $\hat{\beta}$ given by

$$ V(\hat{\beta}) = (X'X)^{-1} \sigma^2, $$

where $\varepsilon$ is assumed to have mean zero and covariance $\sigma^2 I$.
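For a straight-line model the formulas above can be evaluated by hand, since X'X is only a 2 x 2 matrix. The sketch below assumes the design matrix has rows (1, x_i); the function name `ols_line` and the data are illustrative.

```python
def ols_line(x, y):
    """Least squares fit of y = b0 + b1*x.

    Returns the estimate (b0, b1) and the matrix (X'X)^{-1}; multiplying
    the latter by sigma^2 gives the covariance matrix of the estimate.
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx                      # determinant of X'X
    inv = [[sxx / det, -sx / det],
           [-sx / det, n / det]]                 # (X'X)^{-1}
    beta = [inv[0][0] * sy + inv[0][1] * sxy,    # (X'X)^{-1} X'y
            inv[1][0] * sy + inv[1][1] * sxy]
    return beta, inv

beta, inv = ols_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print("estimate:", beta)
print("(X'X)^-1:", inv)
```

Only the x values enter `inv`, which is why the choice of design controls the precision of the estimate before any response is observed.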

By an experimental design, we mean the choice of levels of X. Consider the case of one dimension for the present and assume that there are n observations available. We are interested in knowing the method of allocation of these observations to the various levels of x. That is, the problem is to find levels $x_1, x_2, \ldots, x_k$ to be repeated $n_1, n_2, \ldots, n_k$ times such that $n_1 + n_2 + \cdots + n_k = n$.

The set of $x_i$'s with $n_i$'s is called the design of an experiment. In place of the integers $n_i$, we can use fractions $p_1, p_2, \ldots, p_k$ with $n_i / n = p_i$ and $\sum p_i = 1$. The collection of $x_i$'s with $p_i$'s describes generally a discrete probability measure. The theory of optimal design of experiments is concerned with obtaining such a measure so as to optimize some objective function of the parameters in the assumed model for the experiment.

There are several optimality criteria in the case of regression design of experiments and they are given in terms of the matrix X'X. Suppose $X = (x_1, x_2, \ldots, x_n)$, with $x_i$, i = 1, 2, ..., n, being p-vectors, and let $x \in \mathcal{X}$.

Criterion of G-Optimality

It is also known as the criterion of minimax optimality. Find $x_i$ such that

$$ \min_{x_i,\; i = 1, 2, \ldots, n} \;\; \max_{x \in \mathcal{X}} \;\; x'(X'X)^{-1}x $$

is attained.

Criterion of D-Optimality

In this criterion, we find $x_i$ such that the determinant of the matrix X'X is maximized. That is, find $x_i$ such that we have

$$ \max_{x_i} \; \det (X'X). $$

Criterion of A-Optimality

Find $x_i$ such that

$$ \min_{x_i} \; \operatorname{trace} \, (X'X)^{-1}. $$


Criterion of E-Optimality

This criterion is concerned with finding $x_i$ such that the minimum eigenvalue of X'X is maximized. That is,

$$ \max_{x_i} \; (\text{minimum eigenvalue of } X'X). $$

Many other kinds of optimality criteria in the context of design of regression experiments have been discussed in the literature; for reference, see Federov (1972).
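To make the four criteria concrete, the sketch below evaluates them for a straight-line model with rows (1, x_i) in the design matrix, where X'X is 2 x 2 and everything has a closed form. The function name, the two candidate designs and the grid standing in for the design region are all illustrative assumptions.

```python
import math

def criteria(design, region):
    """D, A, E and G criterion values for the model y = b0 + b1*x."""
    a = float(len(design))             # X'X = [[a, b], [b, d]]
    b = sum(design)
    d = sum(x * x for x in design)
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    trace_inv = inv[0][0] + inv[1][1]
    # Smaller eigenvalue of a symmetric 2 x 2 matrix.
    min_eig = (a + d) / 2.0 - math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    # Largest value of x'(X'X)^{-1}x with x = (1, t)' over the region.
    max_var = max(inv[0][0] + 2.0 * inv[0][1] * t + inv[1][1] * t * t
                  for t in region)
    return {"D": det, "A": trace_inv, "E": min_eig, "G": max_var}

region = [-1.0, -0.5, 0.0, 0.5, 1.0]
ends = criteria([-1.0, -1.0, 1.0, 1.0], region)
centre = criteria([-1.0, 0.0, 0.0, 1.0], region)
print("points at the ends:   ", ends)
print("two points at centre: ", centre)
```

For these two designs all four criteria agree that placing the observations at the ends of the interval is better; in general the criteria need not rank designs in the same order.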


Integrated  Mean  Square  Error  Criterion 


Recently Brown (1979) has proposed the integrated mean square error as an optimization criterion in the context of linear inverse.

This criterion has been used in other contexts as well; see Tapia and Thompson (1978). A common measure of discrepancy between the observed and expected values is obtained in terms of mean squared errors (MSE).

Consider the model

$$ E(Y \mid x) = \alpha + \beta x \qquad \text{and} \qquad V(Y \mid x) = \sigma^2. $$

Let $L \le x \le U$ be the interval of possible x values. MSE(x) is the mean squared error of x as obtained from y. Let W(x) be a weight function. Then the Integrated Mean Square Error is defined as

$$ \text{IMSE} = \int_L^U \text{MSE}(x) \, W(x) \, dx. $$

In calibration problems, Brown has shown that optimization of IMSE gives much better results as compared to simply minimizing MSE. If no special form of the weight function W(x) is suggested by the problem, W(x) may be taken to be uniform over the range (L, U).


3.  Classical  Methods  of  Optimization 


The basic problem of optimization is concerned with finding a value in a finite dimensional set A for which a function f(x), defined on the set A, attains a maximum or a minimum. If A is a finite set, the minimizing and maximizing values always exist. They need not exist when A is not finite. For example, let

$$ f(x) = \begin{cases} 1, & x = 0, \\ x, & x > 0. \end{cases} $$

Then the function f(x), defined over $x \ge 0$, the non-negative part of the real line, does not have a minimum which can be attained. The ideas of infimum and supremum are introduced to take care of such a situation.

Define the supremum of f(x), or sup f(x), as the least value of $\lambda$ such that

$$ f(x) \le \lambda \quad \text{for all } x \in A. $$

Similarly the infimum of f(x), or inf f(x), is defined as the largest value of $\lambda$ such that $f(x) \ge \lambda$ for all $x \in A$.

An important result in this regard is given by the following theorem.

Theorem 3.1. If f(x) is continuous and the set A is a finite closed interval, then f(x) attains its minimum and maximum (extreme) values in A.

For proof, see any book on calculus, for example, Whittle (1971).

The necessary and sufficient conditions for extrema are given by the following theorems, usually available in standard calculus books.
Theorem 3.2. (Necessary Conditions for an extremum) If the derivative $\partial f / \partial x$ exists at an interior point, $x_0$, of the set A, and if $x_0$ is an extremum point, then

$$ \frac{\partial f}{\partial x} = 0 \quad \text{at } x = x_0. $$

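Define the Hessian of a function f(x) by the matrix of second order partial derivatives as follows.

$$ H = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix} $$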


Theorem 3.3. (Sufficient Condition for an extremum) A sufficient condition for f(x) to have a maximum (minimum) at an interior point $x_0 \in A$ is that the first derivatives vanish at $x_0$ and that H exists there and is negative definite (positive definite).

The proofs require expanding the function f(x) with the help of Taylor's theorem using H. For details, see Whittle (1971).


Constrained  Optimization 


In finding extrema of a function f(x) over the set A, there may be additional constraints, such as the condition g(x) = b. Essentially the constraints introduce a subset of the set A over which f(x) should be optimized. The case when the constraints are introduced by inequalities is dealt with by mathematical programming methods.

The method of Lagrange multipliers has been used extensively for solving constrained optimization problems. The method requires optimizing

$$ f(x) + \lambda g(x), $$

where $\lambda$ is some unknown constant. If the number of constraint equations is more than one, Lagrange's method requires optimizing

$$ f(x) + \lambda' g(x), $$

where g is the given vector of functions and the vector $\lambda$ is unknown. For an extensive discussion, see Whittle (1971).
4. Numerical Methods of Optimization

By the very nature of the simulation process, numerical methods are necessary for optimizing simulation models. In the case of functions of one variable, it may sometimes be easy to graph the function and then obtain the optimizing value. In the case of several variables, the process involves a large number of calculations and may exceed the limits of computers.

The optimization of functions in many cases reduces to finding the solutions of equations, since the extremizing values are found by setting the derivatives or partial derivatives, if they exist, equal to zero. We first consider methods of solving an equation of the type

$$ f(x) = 0. \qquad (4.1) $$

General methods of solution are available in textbooks of numerical analysis; for example, see Ralston (1965). We first define Lagrange polynomials, which are used in interpolation. Lagrange polynomials of (n-1)-th degree are defined by


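$$ \ell_j(x) = \frac{p_n(x)}{(x - a_j)\, p_n'(a_j)}, \qquad j = 1, 2, \ldots, n, \qquad (4.2) $$

where

$$ p_n(x) = (x - a_1)(x - a_2) \cdots (x - a_n) \qquad (4.3) $$

is a polynomial of n-th degree with given constants $a_1, a_2, \ldots, a_n$, and $p_n'(a_j)$ gives the derivative of the polynomial $p_n(x)$ at $a_j$. For example, Lagrange polynomials of order 3 are given by

$$ \ell_1(x) = \frac{(x - a_2)(x - a_3)}{(a_1 - a_2)(a_1 - a_3)} \qquad (4.4) $$

$$ \ell_2(x) = \frac{(x - a_1)(x - a_3)}{(a_2 - a_1)(a_2 - a_3)} \qquad (4.5) $$

$$ \ell_3(x) = \frac{(x - a_1)(x - a_2)}{(a_3 - a_1)(a_3 - a_2)} \qquad (4.6) $$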
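The definition translates directly into a few lines of code. The sketch below evaluates the j-th Lagrange polynomial in its product form, which is equivalent to (4.2), and uses it for interpolation; the function names and the 0-based index j are choices made here for illustration.

```python
def lagrange_basis(nodes, j, x):
    """Value at x of the j-th Lagrange polynomial (0-based j) for the nodes."""
    value = 1.0
    for m, a_m in enumerate(nodes):
        if m != j:
            value *= (x - a_m) / (nodes[j] - a_m)
    return value

def interpolate(nodes, values, x):
    """Interpolating polynomial through (nodes[j], values[j]) evaluated at x."""
    return sum(v * lagrange_basis(nodes, j, x) for j, v in enumerate(values))

nodes = [0.0, 1.0, 3.0]
values = [a * a for a in nodes]          # samples of the function x**2
print(interpolate(nodes, values, 2.0))   # three nodes reproduce a quadratic
```

Each basis polynomial equals one at its own node and zero at the others, which is what makes the weighted sum pass through the given values.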

Iterative procedure for roots of the equation f(x) = 0.

Suppose the inverse of the function f exists. Let y = f(x) so that $x = f^{-1}(y) = g(y)$. We are looking for g(0), which will be the root $\alpha$. That is, $g(0) = \alpha$.

The Lagrange interpolation formula gives an approximation for g(y) by h(y), denoted by $g(y) \approx h(y)$:

$$ h(y) = \sum_{j=1}^{n} \ell_j(y)\, g(y_j) = \sum_{j=1}^{n} \ell_j(y)\, x_{i-j+1}, \qquad (4.7) $$

where $g(y_j) = x_{i-j+1}$, given the points $y_1, y_2, \ldots, y_n$. An approximation of $\alpha$ by $x_{i+1}$ is given by h(0). That is,

$$ x_{i+1} = h(0) = \sum_{j=1}^{n} \ell_j(0)\, x_{i-j+1}. \qquad (4.8) $$

Notice that

$$ \ell_j(0) = \frac{(-1)^{n-1}\, y_1 y_2 \cdots y_{j-1} y_{j+1} \cdots y_n}{(y_j - y_1)(y_j - y_2) \cdots (y_j - y_{j-1})(y_j - y_{j+1}) \cdots (y_j - y_n)}. \qquad (4.9) $$

The equation (4.9) gives an n-point iteration process. That is, given $x_i, x_{i-1}, \ldots, x_{i-(n-1)}$, we can find $x_{i+1}$. Or the n-point iteration function is given by

$$ x_{i+1} = F_i(x_i, x_{i-1}, \ldots, x_{i-(n-1)}). \qquad (4.10) $$
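A minimal sketch of such an n-point iteration follows. It keeps the last n iterates, interpolates the inverse function through them and evaluates the interpolant at y = 0. The stopping rule, the iteration cap and the function name are assumptions added for illustration; the sketch also assumes the function values at the retained points stay distinct.

```python
def inverse_interpolation_root(f, start_points, tol=1e-10, max_iter=50):
    """n-point iteration: interpolate x = g(y) through the last n iterates
    and take the value of the interpolant at y = 0 as the next iterate."""
    xs = list(start_points)
    for _ in range(max_iter):
        ys = [f(x) for x in xs]
        x_new = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            weight = 1.0                      # Lagrange polynomial at y = 0
            for m, ym in enumerate(ys):
                if m != j:
                    weight *= (0.0 - ym) / (yj - ym)
            x_new += weight * xj
        if abs(x_new - xs[-1]) < tol:
            return x_new
        xs = xs[1:] + [x_new]                 # drop the oldest point
    return xs[-1]

root = inverse_interpolation_root(lambda x: x * x - 2.0, [1.0, 1.5, 2.0])
print(root)
```

With three starting points this is a three-point iteration; with two points the same code reduces to the secant rule.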


Most iteration procedures use only one-point iteration and the same function for each iteration. That is,

$$ x_{i+1} = F(x_i). \qquad (4.11) $$

There are many methods of iteration. We shall discuss here the most commonly used methods, such as that of Newton-Raphson.
Newton-Raphson Procedure

In this procedure, we use

$$ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}. \qquad (4.12) $$

Using the approximation of $f'(x_i)$ by

$$ f'(x_i) \approx \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}}, $$

we have

$$ x_{i+1} = x_i - \frac{f(x_i)\,(x_i - x_{i-1})}{f(x_i) - f(x_{i-1})}. \qquad (4.13) $$

The above two-point iteration is known as the secant method.
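Both iterations are short enough to write out in full. The sketch below follows (4.12) and (4.13); the stopping tolerance, the iteration cap and the guard against equal function values are assumptions added to make the loops terminate safely.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration (4.12) starting from x0."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Secant iteration (4.13) starting from the two points x0 and x1."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:              # difference quotient undefined; stop
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, x1 = x1, x2
        if abs(x1 - x0) < tol:
            break
    return x1

def g(x):
    return x * x - 2.0

print(newton(g, lambda x: 2.0 * x, 1.0))
print(secant(g, 1.0, 2.0))
```

The secant method needs no derivative but two starting values, and it typically takes a few more iterations than Newton-Raphson to reach the same accuracy.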


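In the case of several equations, a generalized one-point Newton-Raphson iteration procedure can be similarly described. Let $x = (x_1, x_2)'$ be a two dimensional vector. Let

$$ f_1(x) = 0 \quad \text{and} \quad f_2(x) = 0 $$

be the two simultaneous equations to be solved. Then the Newton-Raphson iteration requires the following:

$$ x_{i+1} = x_i - \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} \\ \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} \end{pmatrix}^{-1}_{x = x_i} \begin{pmatrix} f_1(x_i) \\ f_2(x_i) \end{pmatrix}, \qquad (4.15) $$

where $x_i$ here denotes the i-th iterate of the vector x and the matrix of partial derivatives is evaluated at $x = x_i$.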
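For two equations the inverse of the matrix of partial derivatives can be written out explicitly, which gives the following sketch of (4.15). The pair of equations, the starting point and the function names are hypothetical examples.

```python
def newton_2d(f1, f2, jac, start, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration for f1(x1, x2) = 0 and f2(x1, x2) = 0.

    jac(x1, x2) returns (a, b, c, d), the partial derivatives
    df1/dx1, df1/dx2, df2/dx1, df2/dx2 at the current point.
    """
    x1, x2 = start
    for _ in range(max_iter):
        a, b, c, d = jac(x1, x2)
        r1, r2 = f1(x1, x2), f2(x1, x2)
        det = a * d - b * c
        dx1 = (d * r1 - b * r2) / det     # first component of J^{-1} f
        dx2 = (-c * r1 + a * r2) / det    # second component of J^{-1} f
        x1, x2 = x1 - dx1, x2 - dx2
        if abs(dx1) + abs(dx2) < tol:
            break
    return x1, x2

# Hypothetical system: a circle of radius 2 intersected with the line x1 = x2.
solution = newton_2d(
    lambda u, v: u * u + v * v - 4.0,
    lambda u, v: u - v,
    lambda u, v: (2.0 * u, 2.0 * v, 1.0, -1.0),
    (1.0, 2.0),
)
print(solution)
```

In higher dimensions the explicit inverse would be replaced by solving a linear system at each step.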


Gradient  Method 

The gradient method was introduced by Cauchy in 1847. This method utilizes the gradient of the function f(x), given by

$$ g(x) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)'. $$

The gradient represents the direction cosines of the normal to the tangent hyperplane at the point x of the surface f(x). The method utilizes steepest ascent for a maximum and steepest descent for a minimum, to increase the speed of approach to the optimum. Consider the metric

$$ d^2 = (x - y)' B (x - y), $$

where B is a given positive definite matrix and x and y are any two vectors. Then the direction of steepest ascent at $x_0$ is the direction from the point $x_0$ to the point on the ellipsoid

$$ (x - x_0)' B (x - x_0) = k^2 $$

at which f(x) is largest, for small k. The following theorem gives an explicit form for optimization.

Theorem 4.1. For a function f(x), the steepest ascent from $x_0$ occurs in the direction $\delta(x_0)$ given by $\delta(x_0) = B^{-1} g(x_0)$, where $g(x_0)$ is the gradient of f(x) at $x_0$. For proof and other relevant material the reader is referred to Crockett and Chernoff (1955).
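The role of the matrix B can be seen in a small sketch. It moves repeatedly along the direction given by the inverse of B applied to the gradient, using a fixed step length. The objective function, the step lengths and the two choices of B are hypothetical; the fixed step is a simplification and not part of the theorem.

```python
def steepest_ascent(grad, b_inv, x, step, n_iter):
    """Move n_iter times along the direction B^{-1} g(x) with a fixed step."""
    n = len(x)
    for _ in range(n_iter):
        g = grad(x)
        delta = [sum(b_inv[i][j] * g[j] for j in range(n)) for i in range(n)]
        x = [xi + step * di for xi, di in zip(x, delta)]
    return x

# Hypothetical objective f(x) = -(x1 - 1)^2 - 2 (x2 + 0.5)^2,
# which has its maximum at (1, -0.5).
def grad(x):
    return [-2.0 * (x[0] - 1.0), -4.0 * (x[1] + 0.5)]

identity_inv = [[1.0, 0.0], [0.0, 1.0]]      # B = I: ordinary steepest ascent
scaled_inv = [[0.5, 0.0], [0.0, 0.25]]       # B = diag(2, 4)

slow = steepest_ascent(grad, identity_inv, [0.0, 0.0], step=0.1, n_iter=200)
fast = steepest_ascent(grad, scaled_inv, [0.0, 0.0], step=1.0, n_iter=1)
print(slow, fast)
```

With B equal to the identity the iteration needs many small steps, whereas a metric matched to the curvature of this particular quadratic reaches the maximum in a single step.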

5. Optimal Search Procedures

In optimum seeking methods, the aim is to design the most economical or least time consuming procedure.

Suppose a function is to be explored over the points $x_1, x_2, \ldots, x_n$. Let $0 \le x_i \le 1$. Consider the following two situations with n = 3 in Figures 1 and 2, where the values of the function are given by vertical lines.

[Figure 1 and Figure 2: values of the function at three points, shown as vertical lines.]


In Figure 1, the maximum may be in the interval $(0, x_2)$ and in Figure 2, it may be in the interval $(x_1, x_3)$. Such an interval is called the interval of uncertainty. In general the interval of uncertainty is $(x_{k-1}, x_{k+1})$. The length of the interval of uncertainty is given by $\ell(x_n, k)$.

Several search plans based on the interval of uncertainty are given below.

Minimax Search. A plan which minimizes the maximum interval of uncertainty. That is,

$$ \min_{x_1, x_2, \ldots, x_n} \;\; \max_{1 \le k \le n} \;\; \ell(x_n, k). $$


Uniform Pairs Search. It requires that the intervals chosen should be of uniform length. One such plan is to take

x. 


(1+ £>[££] 


1 


-  -  [£]}e 


where [a] denotes the integral part of a.
Other plans, including the Fibonacci Search or Sequential Search plan, which is based on the Fibonacci sequence, and the Golden Section Search plan, are also used in practice. For literature on optimum seeking methods, see Wilde (1964).
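As an illustration of a sequential plan, the sketch below implements a golden section search for the maximum of a function with a single peak on an interval. Each new function value shrinks the interval of uncertainty by the same factor, about 0.618. The test function, the tolerance and the function name are assumptions made for the example.

```python
import math

def golden_section_max(f, lo, hi, tol=1e-8):
    """Locate the maximum of a single-peaked function f on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0       # about 0.618
    x1 = hi - ratio * (hi - lo)
    x2 = lo + ratio * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            # The maximum lies in (x1, hi); the old x2 becomes the new x1.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + ratio * (hi - lo)
            f2 = f(x2)
        else:
            # The maximum lies in (lo, x2); the old x1 becomes the new x2.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - ratio * (hi - lo)
            f1 = f(x1)
    return (lo + hi) / 2.0

peak = golden_section_max(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)
print(peak)
```

Only one new function evaluation is needed per iteration because one of the two interior points is reused, which is the property that makes the plan economical.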

An important class of optimum seeking procedures is concerned with optimizing the regression function in statistics. Such procedures have become known as Response Surface Methods. We shall discuss some elements of this methodology in the next section.


6. Response Surface Methods

The response surface methodology was developed to solve some problems in chemical investigation. However, its use became universal, and in simulation methodology response surface techniques are very commonly used. The problem can be stated as follows. Let a region R of k dimensions be called the factor space, with points $x_u = (x_{1u}, x_{2u}, \ldots, x_{ku})'$. Let the mean, $\eta_u$, of a response $y_u$ depend on the factors $x_u$ through the function

$$ \eta_u = \phi(x_u). \qquad (6.1) $$

Let $y_u$ have variance $\sigma^2$. The problem then is to find, in the smallest number of experiments, a point $x^0$ so as to optimize $\phi$ over the region R.

This classical problem was stated by Hotelling (1941) and Friedman and Savage (1947). Box and Wilson (1951) provided the basic framework to develop optimal response surface designs, and their techniques have found considerable use in many applications. Myers (1971) has collected the available material in a book on response surfaces.

We discuss elements of response surface methodology based on the paper of Box and Wilson. One of their major contributions was to develop new types of designs in place of complete factorial designs.

Let the distance, r, from the origin to the point x be Euclidean, with

$$ r^2 = \sum x_i^2. \qquad (6.2) $$

The object here is to choose x in such a way that

$$ \phi(x) - \phi(0) \qquad (6.3) $$

is maximized with the constraint in (6.2). Using Lagrange's method, we maximize

$$ \psi(x) = \phi(x) - \phi(0) - \tfrac{1}{2} \lambda \sum x_i^2. \qquad (6.4) $$

The stationary solution is given by equating to zero the partial derivatives with respect to $x_i$. We have

$$ \frac{\partial \phi}{\partial x_i} - \lambda x_i = 0, \qquad i = 1, 2, \ldots, k. \qquad (6.5) $$
Squaring and summing over all i and simplifying, we get

$$ \lambda = \frac{1}{r} \left\{ \sum \left( \frac{\partial \phi}{\partial x_i} \right)^2 \right\}^{1/2}. \qquad (6.6) $$

That is, the maximizing point should have coordinates proportional to the derivatives of $\phi$.

Suppose the conditions of Taylor's expansion for $\phi(x)$ in the neighborhood of the origin hold; then $\phi(x)$ can be expanded in linear, quadratic and higher order terms. If we assume that second and higher order terms in the expansion of $\phi$ are zero, then $\phi(x)$ is approximated by a linear function of the following type:

$$ \phi(x) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k. \qquad (6.7) $$

Then,

$$ \frac{\partial \phi(x)}{\partial x_i} = \beta_i, \qquad i = 1, 2, \ldots, k, \qquad (6.8) $$

and the optimal $x_i$ are proportional to $\beta_i$. Similar expressions involving coefficients of linear and quadratic terms can be obtained if the Taylor's expansion of $\phi(x)$ does not contain third and higher order terms. The move along the derivatives of the response function gives the steepest ascent approach to a maximum.


For the sake of clarity of presentation, we assume k = 2. Suppose the third and higher order derivatives of $\phi(x)$ are zero. Hence we represent $\phi(x)$ as follows:

$$ \phi(x) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{12} x_1 x_2 + \beta_{22} x_2^2. \qquad (6.9) $$

Using the usual least squares theory, the regression equation (6.9) can be estimated from six or more points, since there are six unknown constants. As a rule, one would consider a complete 3 x 3 factorial experiment with nine points so as to provide estimates for the quadratic regression (6.9). However, Box and Wilson provided a design, not of the factorial type, which has five points on the vertices of a pentagon and the sixth at the origin. Such a design gives estimates of the coefficients in the regression model and hence of the derivatives. These estimates can then be used to define the path of steepest ascent.

Figure 3
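The sketch below illustrates the pentagon design on a hypothetical noise-free response. With six points and six coefficients the least squares fit reduces to solving a square linear system, done here by a small Gaussian elimination routine. The response function, the unit-radius pentagon and all names are assumptions made for the example.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def quad_terms(x1, x2):
    """Regressors of the quadratic model (6.9)."""
    return [1.0, x1, x2, x1 * x1, x1 * x2, x2 * x2]

def phi(x1, x2):
    """Hypothetical true response surface used to generate the data."""
    return 10.0 + 2.0 * x1 - 1.0 * x2 - 3.0 * x1 * x1 + 0.5 * x1 * x2 - 2.0 * x2 * x2

# Origin plus the five vertices of a regular pentagon of unit radius.
design = [(0.0, 0.0)] + [
    (math.cos(2.0 * math.pi * j / 5.0), math.sin(2.0 * math.pi * j / 5.0))
    for j in range(5)
]
X = [quad_terms(*p) for p in design]
y = [phi(*p) for p in design]
beta = solve(X, y)

# Estimated gradient at the origin gives the initial steepest ascent direction.
norm = math.hypot(beta[1], beta[2])
direction = (beta[1] / norm, beta[2] / norm)
print(beta, direction)
```

With observational error present, the same six points would still determine the coefficients but leave no degrees of freedom for estimating the error variance, which is one reason replicated or larger designs are used in practice.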

Several designs, such as fractional factorials, have also been used in response surface techniques and are available in textbooks on design of experiments; for example, see Kempthorne (1978) and Myers (1971), which provide a large number of new designs commonly applied in response surface methodology.

7. Optimal Design of Regression Experiments

The theory of optimal design of regression experiments is concerned with choosing the levels of the independent variable x for the model

y = f(x)

so as to optimize a certain function of the parameters to be estimated in the model. We have given several optimality criteria commonly used in optimal design theory in Section 2. In simulation studies such criteria assume further importance, since the design of a simulation may require several replications in a given problem. There is an extensive literature on optimality of designs. For a recent survey, see Federov (1972). Reviews of various other aspects of optimal designs have been presented more recently in the statistical literature. A review of D-optimality for regression designs has been given by St. John and Draper (1975) with an extensive bibliography.

A  typical  problem  of  optimal  design  theory  is  of  the 
following  type. 

Example : 

Consider the simple linear regression model

$$ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, 2, \ldots, n. \qquad (7.1) $$

We assume that the errors are uncorrelated and have common variance $\sigma^2$.


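Let

$$ y = (y_1, y_2, \ldots, y_n)', \qquad \varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)' \qquad (7.2) $$

and

$$ X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}. \qquad (7.3) $$

Then

$$ S = X'X = \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}, \qquad S^{-1} = \frac{1}{\Delta} \begin{pmatrix} \sum x_i^2 & -\sum x_i \\ -\sum x_i & n \end{pmatrix}, $$

where

$$ \Delta = n \sum (x_i - \bar{x})^2. \qquad (7.7) $$

The estimates are given by $\hat{\beta} = (X'X)^{-1} X'y$.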
Suppose $V(\hat{\beta}_1) = \sigma^2 / \sum (x_i - \bar{x})^2$ is to be minimized to obtain optimal $x_i$'s. That is, the optimization problem is to maximize

$$ \sum (x_i - \bar{x})^2. \qquad (7.12) $$

Assuming that the x's are between -1 and 1, the solution to the above problem is that the x's should be placed at -1 and 1, half at each place, to make (7.12) a maximum. For D-optimality, we maximize the determinant of S = X'X. That is, again we maximize

$$ n \sum (x_i - \bar{x})^2. \qquad (7.13) $$

Hence the same answer obtains as in minimizing the variance of $\hat{\beta}_1$.
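The claim can be checked by brute force on a small case. The sketch below enumerates every allocation of four observations to a hypothetical five-point grid on [-1, 1] and picks the one that maximizes the sum of squares in (7.12); the grid and the number of observations are assumptions made for the illustration.

```python
from itertools import combinations_with_replacement

def sxx(xs):
    """Sum of squared deviations of the design points from their mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

grid = [-1.0, -0.5, 0.0, 0.5, 1.0]

# Every design with four observations on the grid, repeats allowed.
designs = list(combinations_with_replacement(grid, 4))
best = max(designs, key=sxx)

print("best design:", best, "with sum of squares", sxx(best))
print("equally spaced design:", sxx((-1.0, -0.5, 0.5, 1.0)))
```

Since the determinant in (7.13) is n times the same sum of squares, the design found here is also the D-optimal one among the enumerated designs.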

Comparisons  of  Optimality  Criteria 

G-optimality (minimax optimality) was introduced by Smith (1918) and was exploited by Kiefer and Wolfowitz (1959). Wald (1943) used the criterion of D-optimality in some other context, and it was so named by Kiefer and Wolfowitz (1959). One of the most important results in optimal design theory is the equivalence and characterization of G-optimality and D-optimality under various conditions. This was established by Kiefer and Wolfowitz. Recently such results have also been extended to nonlinear models by White (1973). Various computer algorithms to generate D-optimum designs are available in the literature. Essentially the algorithm of Federov (1972) requires the following steps:

1)  Select  any  non-degenerate  starting  design, 

2)  Compute  the  dispersion  matrix, 

3)  Find  the  point  of  maximum  variance, 

4)  Add  the  point  of  maximum  variance  to 
the  design,  with  measure  proportional  to 
its  variance 

5)  Update  the  design  measure. 

For  further  details,  the  reader  is  referred  to  the 
exposition  by  St.  John  and  Draper  (1975). 
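The steps above can be sketched for the straight-line model on a grid of candidate points. This is a simplified variant and not Federov's algorithm as published: instead of a step proportional to the variance, it uses the fixed step 1/(N + 1), which amounts to adding one observation at the point of maximum variance at each iteration. The candidate grid, the starting design and the function names are illustrative assumptions.

```python
def variance_function(weights):
    """Return d(x) = f(x)' M^{-1} f(x) for f(x) = (1, x)' and the design
    measure given by weights (a dict mapping point -> weight)."""
    m0 = sum(weights.values())
    m1 = sum(w * x for x, w in weights.items())
    m2 = sum(w * x * x for x, w in weights.items())
    det = m0 * m2 - m1 * m1                 # determinant of the information matrix
    return lambda x: (m2 - 2.0 * m1 * x + m0 * x * x) / det

def sequential_d_optimal(candidates, start, n_iter=500):
    """Simplified sequential construction of a D-optimal design measure."""
    # Step 1: non-degenerate starting design with equal weights.
    weights = {x: 0.0 for x in candidates}
    for x in start:
        weights[x] += 1.0 / len(start)
    total = len(start)
    for _ in range(n_iter):
        d = variance_function(weights)      # step 2: dispersion of the design
        x_star = max(candidates, key=d)     # step 3: point of maximum variance
        alpha = 1.0 / (total + 1)           # assumed step length (see lead-in)
        for x in weights:                   # steps 4 and 5: add the point and
            weights[x] *= 1.0 - alpha       # update the design measure
        weights[x_star] += alpha
        total += 1
    return weights

candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
weights = sequential_d_optimal(candidates, start=[-1.0, 0.0, 1.0])
max_variance = max(variance_function(weights)(x) for x in candidates)
print(weights, max_variance)
```

The weights concentrate on the two end points, and the maximum of the variance function approaches 2, the number of parameters, which is the value the equivalence of G-optimality and D-optimality leads one to expect for this model.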